PageRank Pipeline Benchmark: Proposal for a Holistic System Benchmark for Big-Data Platforms


Abstract

The rise of big data systems has created a need for benchmarks to measure and compare the capabilities of these systems. Big data benchmarks present unique scalability challenges. The supercomputing community has wrestled with these challenges for decades and developed methodologies for creating rigorous scalable benchmarks (e.g., HPC Challenge). The proposed PageRank pipeline benchmark employs supercomputing benchmarking methodologies to create a scalable benchmark that is reflective of many real-world big data processing systems. The PageRank pipeline benchmark builds on existing scalable benchmarks (Graph500, Sort, and PageRank) to create a holistic benchmark with multiple integrated kernels that can be run together or independently. Each kernel is well defined mathematically and can be implemented in any programming environment. The linear algebraic nature of PageRank makes it well suited to being implemented using the GraphBLAS standard. The computations are simple enough that performance predictions can be made based on simple computing hardware models. The surrounding kernels provide the context for each kernel that allows rigorous definition of both the input and the output for each kernel. Furthermore, since the proposed PageRank pipeline benchmark is scalable in both problem size and hardware, it can be used to measure and quantitatively compare a wide range of present-day and future systems. Serial implementations in C++, Python, Python with Pandas, Matlab, Octave, and Julia have been implemented and their single-threaded performance has been measured.
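The abstract notes that PageRank is linear algebraic in nature, so one iteration can be written as a sparse vector-matrix product over the graph's edge list. The following is a minimal sketch of that idea in plain Python, not the benchmark's reference implementation. The function name `pagerank`, the damping factor of 0.85, the fixed count of 20 iterations, the uniform handling of vertices without out-edges, and the three-vertex example graph are all assumptions chosen for illustration.

```python
def pagerank(edges, num_vertices, damping=0.85, iterations=20):
    """Power-iteration PageRank over a list of (source, target) edges.

    Hypothetical helper for illustration; damping and iteration count
    are assumed defaults, not values taken from the abstract.
    """
    # Out-degree of each vertex, used to normalize each row of the
    # adjacency matrix so that it sums to one.
    out_degree = [0] * num_vertices
    for src, _ in edges:
        out_degree[src] += 1

    # Start from a uniform rank vector.
    rank = [1.0 / num_vertices] * num_vertices

    for _ in range(iterations):
        # Rank held by vertices with no out-edges is spread uniformly
        # (an assumed convention that keeps the ranks summing to one).
        dangling = sum(r for r, d in zip(rank, out_degree) if d == 0)
        base = (1.0 - damping + damping * dangling) / num_vertices
        new_rank = [base] * num_vertices

        # Sparse vector-matrix product: each edge forwards an equal
        # share of its source vertex's rank to its target vertex.
        for src, dst in edges:
            new_rank[dst] += damping * rank[src] / out_degree[src]

        rank = new_rank

    return rank


# Tiny assumed example: a 3-vertex directed graph.
edges = [(0, 1), (1, 2), (2, 0), (2, 1)]
ranks = pagerank(edges, num_vertices=3)
```

Each iteration touches every edge once, so its cost grows linearly with the number of edges, which is the kind of simple relationship that makes performance predictions from basic hardware models plausible.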
